This project is inspired by Moneyball: The Art of Winning an Unfair Game, a book by Michael Lewis, published in 2003, about the Oakland Athletics baseball team and its general manager Billy Beane. Its focus is the team’s analytical, evidence-based, sabermetric approach to assembling a competitive baseball team despite Oakland’s small budget (Wiki source).
In 2001, the Oakland Athletics’ team salary was only $34 million, compared with $112 million for the league-leading New York Yankees.
The discrepancy in team salary is not as drastic in the NBA.
Here I use a web-scraping function from mbjoseph to pull NBA team salaries.
- Web scraping hoopshype.com/salaries…
## [1] "Loaded 30 team salaries."
Here are the six largest team salaries:
## Team X2020.21
## 1 Golden State 170722835
## 2 Brooklyn 169052492
## 3 Philadelphia 147806853
## 4 LA Clippers 139722606
## 5 LA Lakers 138326578
## 6 Utah 136048278
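A ranking like the one above comes from a sort on the salary column. Here is a minimal sketch, assuming a data frame shaped like the printed output (columns `Team` and `X2020.21`). The eight rows are salaries that appear in this write-up, not the full 30-team table, and `team_salary` is a name chosen for illustration.

```r
# Toy subset of the scraped table (8 of the 30 teams).
team_salary <- data.frame(
  Team = c("Phoenix", "Utah", "Brooklyn", "LA Lakers", "Milwaukee",
           "Golden State", "LA Clippers", "Philadelphia"),
  X2020.21 = c(128858241, 136048278, 169052492, 138326578, 135449418,
               170722835, 139722606, 147806853)
)

# Sort by salary, largest first, and keep the top six rows.
ranked <- team_salary[order(team_salary$X2020.21, decreasing = TRUE), ]
top6 <- head(ranked, 6)
rownames(top6) <- NULL
print(top6)
```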
Boxplot of team salary:
In 2001 the Oakland A’s paid about half the median MLB team salary. With this in mind, we will attempt to build a competitive team with half the median NBA team salary. The median team salary in the NBA is roughly $130 million, which leaves us with $65 million to work with.
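The budget rule is simple arithmetic. The sketch below assumes a helper named `half_median_budget` (hypothetical, not part of the original analysis) and feeds it the rounded $130M median quoted above instead of the full scraped salary vector.

```r
# Budget = a fraction (default one half) of the median team salary.
half_median_budget <- function(team_salaries, fraction = 0.5) {
  fraction * median(team_salaries)
}

# Using the rounded league median from the text; with the scraped data
# you would pass the full X2020.21 column instead.
budget <- half_median_budget(130e6)
cat(sprintf("Budget: $%.0fM\n", budget / 1e6))
```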
Hypothesis
Similar to baseball, we can create a competitive team with a salary of $65M.
What constitutes a competitive team?
Well, that’s not a simple question to answer. We need a team with players who contribute both offensively and defensively. First, we will analyze plus-minus statistics.
Plus-Minus
- Plus-Minus, a.k.a. +/-, simply keeps track of the net change in the score while a given player is on the court.
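To make the definition concrete, here is a small sketch with made-up numbers. The `stints` table is hypothetical: each row is a stretch of play, with the points scored by each side and a flag for whether our player was on the floor.

```r
# Hypothetical stints (illustrative values only).
stints <- data.frame(
  team_pts = c(12, 8, 15, 6),
  opp_pts  = c(10, 11, 9, 6),
  on_court = c(TRUE, FALSE, TRUE, TRUE)
)

net <- stints$team_pts - stints$opp_pts

# Raw plus-minus: net score change while the player is on the court.
plus_minus <- sum(net[stints$on_court])

# For comparison, the team's net score change while he sits.
off_court_net <- sum(net[!stints$on_court])

cat("On court:", plus_minus, " Off court:", off_court_net, "\n")
```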
Real Plus-Minus
Real Plus-Minus can be broken down into offensive and defensive metrics:
Offensive Real Plus-Minus (ORPM): Player’s average impact on his team’s offensive performance, measured in points scored per 100 offensive possessions.
Defensive Real Plus-Minus (DRPM): Player’s average impact on his team’s defensive performance, measured in points allowed per 100 defensive possessions.
RPM Wins
- RPM Wins provide an estimate of the number of wins each player has contributed to his team’s win total on the season. RPM Wins include the player’s Real Plus-Minus and his number of possessions played.
Before analyzing player statistics we need to know how much each player is worth. Let’s load individual player salaries.
- Web scraping hoopshype.com/salaries/players/…
## [1] "Loaded 578 player salaries."
Boxplot of NBA player salaries:
Curious how much your favorite player is being paid this season? Search and see!
- Web scraping espn.com/nba/statistics/…
## [1] "Loaded 10 statistics for 530 players."
## RK Player TEAM GP MPG ORPM DRPM RPM WINS POS
## 1 1 Stephen Curry GS 62 34.1 7.10 0.15 7.24 19.26 PG
## 2 2 LeBron James LAL 43 33.7 4.67 2.06 6.74 11.72 SF
## 3 3 Rudy Gobert UTAH 68 30.9 -1.45 7.71 6.26 15.61 C
## 4 4 Paul George LAC 53 33.7 1.62 3.85 5.46 11.83 SG
## 5 5 Joel Embiid PHI 49 31.4 2.31 2.94 5.24 10.13 C
## 6 6 Giannis Antetokounmpo MIL 59 32.9 3.86 1.09 4.95 13.31 PF
Add player salary as the final column of the player stats:
## RK Player TEAM GP MPG ORPM DRPM RPM WINS POS X2020.21
## 1 1 Stephen Curry GS 62 34.1 7.10 0.15 7.24 19.26 PG 43006362
## 2 2 LeBron James LAL 43 33.7 4.67 2.06 6.74 11.72 SF 39219566
## 3 3 Rudy Gobert UTAH 68 30.9 -1.45 7.71 6.26 15.61 C 26775281
## 4 4 Paul George LAC 53 33.7 1.62 3.85 5.46 11.83 SG 35450412
## 5 5 Joel Embiid PHI 49 31.4 2.31 2.94 5.24 10.13 C 29542010
## 6 6 Giannis Antetokounmpo MIL 59 32.9 3.86 1.09 4.95 13.31 PF 27528088
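Attaching salaries to the stats table is a join on the player name. Here is a minimal sketch with base R's `merge`, using three rows from the tables above. The data frame names are illustrative, and it assumes names are spelled identically in both sources, which may not hold for every scraped player.

```r
player_stats <- data.frame(
  Player = c("Stephen Curry", "LeBron James", "Rudy Gobert"),
  RPM    = c(7.24, 6.74, 6.26),
  WINS   = c(19.26, 11.72, 15.61)
)
player_salary <- data.frame(
  Player   = c("Rudy Gobert", "Stephen Curry", "LeBron James"),
  X2020.21 = c(26775281, 43006362, 39219566)
)

# Left join: all.x = TRUE keeps every player in the stats table and
# leaves NA in the salary column when no match is found.
merged <- merge(player_stats, player_salary, by = "Player", all.x = TRUE)

# merge() sorts by the key, so restore the RPM ranking.
merged <- merged[order(merged$RPM, decreasing = TRUE), ]
rownames(merged) <- NULL
print(merged)
```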
Analyzing RPM vs. Salary
What’s going on?
The chart on the left is a plain scatterplot mess. On the right we can understand the data a little bit more.
- In general, players who play more minutes are paid more and have a higher plus/minus. This might make things difficult for us because we need a competitive team that is cheap!
Time to collect more data
Our Plus/Minus stats come from NBA.com.
Also, we will web scrape the data thanks to some help from our guy Ashwin.
- Web scraping stats.nba.com/stats/…
## [1] "Loaded 65 statistics for 537 players."
Now that we have all the individual player statistics we could ever need to analyze a player’s value, we will shift over to compiling NBA team statistics.
- Web scraping nba.com/standings…
## [1] "Loaded 81 stats for 30 teams."
## [1] "Cleaned and reduced to 8 stats (including salary) for 30 teams."
Why the team stats?
These stats will enable us to test how competitively our team stacks up against other teams in the NBA.
## Team TeamName W L PPG OppPPG DiffPointsPG X2020.21
## 1 Utah Jazz 50 19 116.7 107.5 9.2 136048278
## 2 Philadelphia 76ers 47 22 113.6 108.1 5.5 147806853
## 3 Phoenix Suns 48 21 114.8 109.3 5.5 128858241
## 4 Brooklyn Nets 45 24 118.6 114.4 4.1 169052492
## 5 Milwaukee Bucks 44 25 119.9 114.0 5.9 135449418
## 6 LA Clippers 46 23 114.0 107.8 6.3 139722606
Team Salary vs. Team Wins
Let’s see if spending more money translates to winning more games.
Looking at the data another way we observe a pattern that might be obvious.
Team Wins vs. Team Diff PPG
Wins and DiffPointsPG
- These stats are generally correlated
- Higher DiffPointsPG translates to more Wins
- However, higher salary doesn’t always correlate with more wins
- There are lower salary teams with better PPG and Wins!
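These relationships can be checked with `cor`. The sketch below uses only the six teams printed in the team table above, so the coefficients describe that small subset and should not be read as league-wide values.

```r
teams <- data.frame(
  Team = c("Utah", "Philadelphia", "Phoenix", "Brooklyn", "Milwaukee",
           "LA Clippers"),
  W = c(50, 47, 48, 45, 44, 46),
  DiffPointsPG = c(9.2, 5.5, 5.5, 4.1, 5.9, 6.3),
  X2020.21 = c(136048278, 147806853, 128858241, 169052492, 135449418,
               139722606)
)

# Pearson correlation of wins with point differential and with salary.
cor_diff   <- cor(teams$W, teams$DiffPointsPG)
cor_salary <- cor(teams$W, teams$X2020.21)

cat(sprintf("W vs DiffPointsPG: %.2f\nW vs salary: %.2f\n",
            cor_diff, cor_salary))
```

Running the same two calls on all 30 teams would be the fairer test of the pattern described in the list.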
This finding gives us hope. It opens the door a little bit more for being able to build a winning team on a low budget.
Now what?
Well, we have evidence that it is possible to be a competitive team with a lower-end budget. Let’s see how far we can stretch the limits now that we have all the player and team statistics we need for analysis.
First, let’s filter out players…
- With few minutes (we need players!)
- And a Plus/Minus < 200 (we need good players)
- With salaries over $15M (we need affordable players)
## [1] "Now we have 15 players to analyze."
Here is the composition of the team:
##
## C PF PG SF SG
## 1 1 3 2 5
Here is the team:
## Player POS MPG P_M_PG X2020.21
## 1 Georges Niang SF 15.8 4.782609 1783557
## 2 Reggie Jackson PG 23.2 3.723077 2331593
## 3 Donte DiVincenzo SG 27.4 4.500000 3044160
## 4 Mikal Bridges SF 33.0 4.231884 4359000
## 5 Pat Connaughton SG 22.9 3.424242 4938273
## 6 Donovan Mitchell SG 33.4 5.415094 5195501
## 7 Trae Young PG 34.1 3.766667 6571800
## 8 Seth Curry SG 28.9 5.527273 7834449
## 9 Royce O'Neale PF 31.7 6.602941 8500000
## 10 Deandre Ayton C 30.7 3.971014 10018200
## 11 Joe Ingles SG 27.8 6.546875 10363637
## 12 Jordan Clarkson PG 26.6 4.492308 11500000
## [1] "The salary is 76.4 M"
From the analysis of plus/minus we filtered down to a team that is guard-heavy and about $11.4M over our $65M budget. Not bad for a first attempt. Let’s keep going with real plus/minus.
Attempt 2: let’s filter out players…
- With few minutes (we need players!)
- And a Real Plus-Minus < 1.8 (we need good players)
- With salaries over $15M (we need affordable players)
## [1] "Now we have 15 players to analyze."
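Each attempt applies the same three filters with a different quality metric. Here is a sketch of that step, assuming a helper called `filter_candidates` (hypothetical) and a minimum of 15 minutes per game. The minutes cutoff is an assumption, since the write-up does not state the exact threshold. The sample rows are taken from tables in this report.

```r
# Keep players who play enough, rate well on `metric`, and are affordable.
filter_candidates <- function(players, metric, min_metric,
                              max_salary = 15e6, min_mpg = 15) {
  keep <- players$MPG >= min_mpg &
    players[[metric]] >= min_metric &
    players$X2020.21 <= max_salary
  players[keep, ]
}

players <- data.frame(
  Player   = c("Stephen Curry", "Rudy Gobert", "Duncan Robinson",
               "Mikal Bridges", "Jae Crowder"),
  MPG      = c(34.1, 30.9, 31.7, 33.0, 27.4),
  RPM      = c(7.24, 6.26, 2.83, 1.97, 1.86),
  X2020.21 = c(43006362, 26775281, 1663861, 4359000, 9258000)
)

# Attempt 2 settings: RPM of at least 1.8, salary of at most $15M.
candidates <- filter_candidates(players, metric = "RPM", min_metric = 1.8)
print(candidates)
```

Passing the metric as a column name lets the same function serve all three attempts by changing only `metric` and `min_metric`.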
Here is the composition of the team:
##
## C PF PG SF SG
## 1 2 3 3 3
Here is the team:
## Player POS MPG RPM X2020.21
## 1 Duncan Robinson SG 31.7 2.83 1663861
## 2 Donte DiVincenzo SG 27.4 2.24 3044160
## 3 John Collins PF 29.6 2.19 4137302
## 4 Mikal Bridges SF 33.0 1.97 4359000
## 5 Bam Adebayo C 33.5 2.93 5115492
## 6 Donovan Mitchell SG 33.4 2.54 5195501
## 7 Trae Young PG 34.1 2.12 6571800
## 8 Luka Doncic PG 34.5 2.87 8049360
## 9 De'Aaron Fox PG 35.1 2.72 8099627
## 10 Jae Crowder PF 27.4 1.86 9258000
## 11 Kyle Anderson SF 27.3 2.48 9505100
## 12 Jayson Tatum SF 35.9 2.42 9897120
## [1] "The salary is 74.9 M"
From the analysis of real plus/minus we again filtered down to a team that is over budget, this time by about $9.9M, but it looks more like a normal lineup. Not bad for a second attempt. Let’s keep going with real plus/minus wins.
Attempt 3: let’s filter out players…
- With few minutes (we need players!)
- And an RPM WINS < 5 (we need good players)
- With salaries over $15M (we need affordable players)
## [1] "Now we have 26 players to analyze."
Here is the composition of the team:
##
## C PF PG SF SG
## 1 1 3 2 5
Here is the team:
## Player POS MPG WINS X2020.21
## 1 Duncan Robinson SG 31.7 9.73 1663861
## 2 Kevin Huerter SG 31.0 5.37 2761920
## 3 Donte DiVincenzo SG 27.4 7.15 3044160
## 4 John Collins PF 29.6 6.83 4137302
## 5 Reggie Bullock SF 29.6 5.83 4200000
## 6 Mikal Bridges SF 33.0 8.28 4359000
## 7 Bam Adebayo C 33.5 9.19 5115492
## 8 Donovan Mitchell SG 33.4 7.65 5195501
## 9 Trae Young PG 34.1 7.80 6571800
## 10 Luka Doncic PG 34.5 9.93 8049360
## 11 De'Aaron Fox PG 35.1 9.04 8099627
## 12 RJ Barrett SG 34.9 5.64 8231760
## [1] "The salary is 61.4 M"
Ayyoo!
Now we are in business. Out of the 26 players that passed the filters, we chose to stick with the 12 that were the least expensive. As of May 6th, we beat our goal by $3.6 million. This is subject to change because the web scraping pulls live data and the season is still underway.
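Picking the 12 cheapest candidates and totalling their salaries can be sketched as follows. The first 12 rows are the roster printed above. The last two rows (Jae Crowder, Kyle Anderson) are added only so the example has something to cut and are not claimed to be in the actual candidate pool.

```r
candidates <- data.frame(
  Player = c("Duncan Robinson", "Kevin Huerter", "Donte DiVincenzo",
             "John Collins", "Reggie Bullock", "Mikal Bridges",
             "Bam Adebayo", "Donovan Mitchell", "Trae Young",
             "Luka Doncic", "De'Aaron Fox", "RJ Barrett",
             "Jae Crowder", "Kyle Anderson"),
  X2020.21 = c(1663861, 2761920, 3044160, 4137302, 4200000, 4359000,
               5115492, 5195501, 6571800, 8049360, 8099627, 8231760,
               9258000, 9505100)
)

# Sort by salary, cheapest first, and keep a 12-man roster.
roster <- head(candidates[order(candidates$X2020.21), ], 12)
total  <- sum(roster$X2020.21)
budget <- 65e6

cat(sprintf("The salary is %.1f M\n", total / 1e6))
cat(sprintf("Under budget by %.1f M\n", (budget - total) / 1e6))
```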
Now, the real question:
Can this newly created $Bball team compete with current teams?
- Let’s test by pro-rating RPM WINS vs. existing league leaders’ RPM WINS
## [1] "The top team is: Utah with 50 wins and 19 losses."
Our team will play a similar style to other NBA teams:
- players 1-3 will play 34 minutes per game (mpg),
- players 4-8 will play 24 mpg,
- players 9 & 10 will play 9 mpg,
- and players 11 & 12 will be reserves.
## [1] "Our $Bball team, with a salary of 61.4M, has a projected W-L record of 63-6!!!"
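One way to pro-rate RPM WINS under the minutes plan above is to scale each player's WINS by the ratio of his new minutes to his current minutes per game. The sketch makes two assumptions that the write-up does not confirm: players are numbered 1 to 12 by descending RPM WINS, and wins scale linearly with minutes. It may differ from the calculation behind the printed record.

```r
roster <- data.frame(
  Player = c("Duncan Robinson", "Kevin Huerter", "Donte DiVincenzo",
             "John Collins", "Reggie Bullock", "Mikal Bridges",
             "Bam Adebayo", "Donovan Mitchell", "Trae Young",
             "Luka Doncic", "De'Aaron Fox", "RJ Barrett"),
  MPG  = c(31.7, 31.0, 27.4, 29.6, 29.6, 33.0, 33.5, 33.4, 34.1, 34.5,
           35.1, 34.9),
  WINS = c(9.73, 5.37, 7.15, 6.83, 5.83, 8.28, 9.19, 7.65, 7.80, 9.93,
           9.04, 5.64)
)

# Assumption: rank players by RPM WINS, best first.
roster <- roster[order(roster$WINS, decreasing = TRUE), ]

# Minutes plan from the text: 3 x 34, 5 x 24, 2 x 9, 2 reserves.
roster$new_mpg <- c(rep(34, 3), rep(24, 5), rep(9, 2), rep(0, 2))

# Sanity check: five players on the floor for a 48-minute game.
stopifnot(sum(roster$new_mpg) == 5 * 48)

# Assumption: wins scale linearly with minutes played.
roster$prorated_wins <- roster$WINS * roster$new_mpg / roster$MPG
projected_wins <- sum(roster$prorated_wins)

cat(sprintf("Projected wins: %.1f\n", projected_wins))
```

With these rows the total comes out at roughly 63 wins, which is in line with the projected record shown above.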
Success!
We are on track to head into the playoffs as the number 1 overall team! At this rate we may even post the best record of all time.